1  Introduction


In most practical applications of term rewriting systems, such as verifications done
with the Boyer-Moore prover [Boyer and Moore, 1979], confluence cannot be
achieved. In such cases attempts to prove true equations often fail. This paper
describes a new term rewriting procedure that is intended to improve the success
rate of proof attempts in nonconfluent theories. In cases where normal term
rewriting fails, success can sometimes be achieved by generating a set of normal
forms rather than an individual normal form. This is done using a term ordering
under which a single term can have many normal forms, all of which are minimal.
Somewhat surprisingly, canonicalization under such weak term orderings is
possible. The canonical form of a given term is taken to be the set of all
normal forms.

The main problem with using sets as canonical forms is that the sets involved
can be quite large. In fact, to improve the success rate in nonconfluent theories
we would like the canonical sets to be as large as possible. Fortunately, large
sets can be compactly represented with context free grammars. A finite grammar
can represent an infinite set of terms. More importantly, very large finite term
sets can be represented by compact grammars. Consider the equational theory
consisting of the single commutativity axiom x + y = y + x. In this
theory the equivalence class of a sum of n constants contains order 2^n terms.
However, a grammar for generating this class contains only order n productions.
As another example consider the theory consisting of the single associativity axiom
x + (y + z) = (x + y) + z. In this theory the size of the equivalence class of a sum of
n constants is given by a Catalan number, which grows exponentially in n. However,
the equivalence class of a sum of n constants can be generated by a grammar with
order n^3 productions. In the equational theory that contains both associativity
and commutativity the grammar for generating the equivalence class of a sum of
n constants grows exponentially in n. However, the grammar is still vastly smaller
than the equivalence class itself.

Using grammars to represent equivalence classes involves the well known
congruence closure procedure. Congruence closure is an efficient algorithm for
determining the consequences of a finite set of ground equations [Kozen, 1977],
[Shostak, 1978], [Nelson and Oppen, 1980], [Downey et al., 1980]. A finite set
of ground equations can be converted to a grammar that encodes the congruence
relation on terms implicit in the equations. Congruence closure can be viewed as
an algorithm for converting ground equations to grammars. The rewriting
procedure described here operates on grammars: a grammar is repeatedly rewritten
to incorporate new ground equations.

The rewriting procedure described here is analogous to ordered rewriting
systems [Bachmair et al., 1987], [Hsiang and Rusinowitch, 1987], [Martin and
Nipkow, 1990], [Peterson, 1990]. It rewrites (representations of equivalence classes
of) ground terms using a set of unordered equations. This rewriting is done in
the presence of a well founded order on ground terms and the rewriting process
is guaranteed to terminate. However, unlike previous ordered rewriting systems,
the order on ground terms is not assumed to be total. The commutativity
equation x + y = y + x can be handled by placing both sides of the equation into the
representation of the set being rewritten. Thus the grammar rewriting procedure
described here provides a way of handling non-orientable equations that is
different both from ordered rewriting and from the use of special unification algorithms
[Jouannaud and Kirchner, 1986].

The grammar rewriting procedure described here has been incorporated into
the Ontic verification system, which is under continued development by the
author, Robert Givan, Carl Witty, and Kevin Zalondek. Experimentation with the
procedure is currently under way.


2  Congruence  Grammars 

Congruence grammars provide a way of compactly representing equivalence classes
of first order terms. Each equivalence class is represented by a nonterminal symbol
of the grammar which generates the elements of the class. Equivalence classes are
always disjoint sets. This observation motivates the restriction that no two
productions of a congruence grammar can have the same right hand side. For example,
we cannot have X → a and Y → a where X and Y are distinct nonterminal
symbols.

Definition: A Congruence Grammar is a set of productions of the
form X → f(Y_1, ..., Y_n), where f is an n-ary function symbol and
each Y_i is a nonterminal symbol, and where no two productions have
the same right hand side.

We use the standard definition of a first order term where we assume an infinite
set of function symbols of each arity (number of arguments). Constant symbols
are treated as function symbols of no arguments. So, for example, the production
X → a is a production of the above form where a is a function symbol of no arguments.
The above definition allows us to prove that distinct nonterminal symbols generate
disjoint classes.

Lemma: A given term is generated by at most one nonterminal of a
congruence grammar.

Proof: The proof is by induction on the size of the term. No constant
(zero-ary function) can be generated by more than one nonterminal
symbol because this would imply the existence of two productions with
the same right hand side. Now consider a term t = f(s_1, ..., s_n) such
that all terms smaller than t are generated by at most one nonterminal
symbol. Suppose that t is generated by two distinct nonterminal symbols
X and Y using the productions X → f(Z_1, ..., Z_n) and Y → f(W_1, ..., W_n)
respectively. Each subterm s_i is then generated by both Z_i and W_i. Since
no term smaller than t can be generated by more than one nonterminal
symbol we must have that Z_i equals W_i for each i. But this violates
the assumption that no two productions have the same right hand side.


Example, Representing an Infinite Class: Consider the equivalence class of
the constant symbol a under the single equation a = f(a). This infinite equivalence
class is a context free language generated by the following two productions.

X → f(X)

X → a
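The language generated by a nonterminal can be enumerated up to a bounded nesting depth. The following is a minimal sketch, assuming productions are stored as triples (X, f, (Y_1, ..., Y_n)) and terms as nested tuples; this encoding and the name `generate` are illustrative choices, not part of the procedure described in this paper.

```python
from itertools import product

def generate(grammar, nonterminal, depth):
    """Return the terms of nesting depth <= depth generated by `nonterminal`.

    grammar: list of productions (X, f, (Y_1, ..., Y_n)).
    A term is a nested tuple (f, s_1, ..., s_n); a constant is (a,).
    """
    if depth < 0:
        return set()
    terms = set()
    for lhs, f, args in grammar:
        if lhs != nonterminal:
            continue
        # Terms available for each argument position, one level shallower.
        choices = [generate(grammar, y, depth - 1) for y in args]
        for combo in product(*choices):
            terms.add((f, *combo))
    return terms

# The two productions X -> f(X) and X -> a from the example above.
G = [("X", "f", ("X",)), ("X", "a", ())]
```

With this grammar, `generate(G, "X", 3)` yields a, f(a), f(f(a)), and f(f(f(a))); raising the depth bound adds one term per level, reflecting the infinite class.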

Example, An Equivalence Class under Commutativity: Consider n distinct
constants a_1, a_2, ..., a_n and define the term t_i to be the sum of the first i constants
associated to the left, i.e., t_1 is a_1, t_2 is a_1 + a_2, t_3 is (a_1 + a_2) + a_3, and so on. Note
that for 1 < i ≤ n we have that t_i is t_{i-1} + a_i. Suppose that + is commutative
but not associative. In this case the equivalence class of the term t_n is generated
by the nonterminal X_n in the grammar containing the 2(n − 1) productions of the
form

X_i → (a_i + X_{i-1})

X_i → (X_{i-1} + a_i)

together with the production

X_1 → a_1.

The 2^(n-1) terms in this equivalence class are generated by a grammar with 2n − 1
productions.
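The two counts in this example can be checked for small n. The sketch below is illustrative only: it writes productions and terms as plain strings, and the function names are assumptions introduced here.

```python
def commutative_productions(n):
    """The 2n - 1 productions of the commutativity example, as strings."""
    prods = ["X1 -> a1"]
    for i in range(2, n + 1):
        prods.append(f"X{i} -> (a{i} + X{i-1})")
        prods.append(f"X{i} -> (X{i-1} + a{i})")
    return prods

def generated_terms(n):
    """The set of terms generated by X_n, as strings."""
    if n == 1:
        return {"a1"}
    smaller = generated_terms(n - 1)
    a = f"a{n}"
    # X_n -> (a_n + X_{n-1}) and X_n -> (X_{n-1} + a_n)
    return {f"({a} + {t})" for t in smaller} | {f"({t} + {a})" for t in smaller}
```

For n = 5, for instance, `commutative_productions(5)` has 9 entries while `generated_terms(5)` has 16 elements, matching 2n − 1 and 2^(n-1).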

Example, An AC Equivalence Class: Consider n constants a_1, a_2, ..., a_n
and suppose that + is both associative and commutative. Let S be any non-empty
subset of these constants. We can write the set S as {a_{i_1}, a_{i_2}, ..., a_{i_k}} where
j < h implies i_j < i_h. We define the term t_S to be the term
(...((a_{i_1} + a_{i_2}) + a_{i_3}) + ... + a_{i_k}).
Let U be the set of all n constants. The equivalence class of the term t_U can be
generated by a grammar with nonterminal symbols of the form X_S where S is a
nonempty subset of the constant symbols. This grammar contains all productions
of the form

X_S → (X_W + X_V)

where W and V are disjoint nonempty subsets of U and S is W ∪ V. The grammar also
contains all productions of the form

X_{{a_i}} → a_i.

The equivalence class of t_U is generated by the nonterminal X_U of this grammar.
More generally, the nonterminal X_S generates the equivalence class of t_S.

The grammar that generates the equivalence class of t_U grows exponentially in
the size of U: the grammar contains order 3^n productions. This fact can be derived
by observing that each production

X_S → (X_W + X_V)

classifies each element of U in one of three ways: as a member of W, as
a member of V, or as a member of neither. There are 3^n ways of classifying n
constants into three groups. There are actually fewer than 3^n productions because
the sets W and V in the above production must both be nonempty. The number
of productions in this grammar should be contrasted with the number of terms in
the equivalence class. For n = 8 the grammar contains 6,058 productions while
the equivalence class contains over 17 million terms.
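The figures quoted for n = 8 can be reproduced with a short calculation. This sketch assumes the counting argument given above for productions, and assumes that the class size is the number of ordered binary trees with n distinctly labelled leaves (n! leaf orderings times the number of bracketings); the function names are illustrative.

```python
from math import comb, factorial

def ac_production_count(n):
    """Count productions of the AC grammar over n constants.

    For each subset S of size k there are 2^k - 2 ordered splits of S into
    nonempty disjoint W and V; there is also one production per constant.
    """
    binary = sum(comb(n, k) * (2 ** k - 2) for k in range(2, n + 1))
    return binary + n

def ac_class_size(n):
    """Count the terms in the AC equivalence class of a sum of n constants."""
    bracketings = comb(2 * (n - 1), n - 1) // n   # Catalan number C_{n-1}
    return factorial(n) * bracketings
```

For n = 8 these give 6058 productions and 17297280 terms.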

With no term ordering the grammar rewriting procedure described in section 4
requires time on the order of 4^n to construct the grammar that generates the
equivalence class of t_U under the equational theory that contains the associativity
and commutativity axioms for +. It seems that this may be acceptable in a large
number of rewriting applications. In the theorems of the Boyer-Moore corpus, for
example, the number of elements of an AC term rarely exceeds three [Boyer and
Moore, 1979]. For n = 5 we have that 4^n is about a thousand, a small number
for modern computers. When a nontrivial term ordering is imposed the grammar
rewriting process generates a smaller grammar and terminates more quickly.


3  Congruence  Grammars  and  Ground  Equations 

A  congruence  grammar  is  a  representation  of  an  equivalence  relation  on  terms. 
It  turns  out  that  those  relations  which  can  be  represented  by  finite  congruence 
grammars  are  exactly  those  relations  which  are  the  deductive  closure  of  a  finite 
set  of  ground  equations.  This  section  states  some  basic  results  relating  congruence 
grammars and ground equations.

First, we define the equivalence relation on (all) ground terms represented
by a congruence grammar. This is done by defining an interning operation on
ground terms. The word "interning" is commonly used to describe the way strings
are mapped to variables in most programming languages. Here we are mapping
semantically equivalent terms to the same internal data structure. We assume
a given congruence grammar and define an interning operation such that for any
term t, the result of interning t, denoted I[t], is a nonterminal of the grammar such
that I[t] generates t. If there is no nonterminal of the grammar which generates t
then the grammar is extended with new productions in such a way that the desired
nonterminal is created.

Procedure for Interning a term t:

• Let s_1, ..., s_n be the (possibly empty) sequence of immediate
subterms of t, i.e., the terms such that t = f(s_1, ..., s_n).

• Let Y_1, ..., Y_n be the result of recursively interning the terms
s_1, ..., s_n.

• Let X → f(Y_1, ..., Y_n) be the production whose right hand side
is f(Y_1, ..., Y_n). If there is no such production then create one
with a new nonterminal X and add it to the grammar.

• Return the nonterminal symbol X.

Since no two productions share the same right hand side the productions can
be stored in a hash table indexed by their right hand sides. Throughout this
section we assume the presence of such a hash table and assume that hash table
operations can be performed in unit time. Under these assumptions the above
intern procedure runs in time linear in the size of the given term.
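The interning procedure and its hash table can be sketched as follows. This is a minimal illustration, assuming terms are nested tuples (f, s_1, ..., s_n), nonterminals are integers, and a Python dictionary plays the role of the hash table; the class and attribute names are hypothetical.

```python
class CongruenceGrammar:
    """Productions stored in a table indexed by right hand side."""

    def __init__(self):
        self.table = {}   # (f, Y_1, ..., Y_n) -> nonterminal X
        self.fresh = 0    # next unused nonterminal

    def intern(self, term):
        """Return the nonterminal generating `term`, extending the grammar if needed."""
        f, *subterms = term
        # Recursively intern the immediate subterms to obtain Y_1, ..., Y_n.
        rhs = (f, *(self.intern(s) for s in subterms))
        if rhs not in self.table:
            # No production has this right hand side: create a new nonterminal.
            self.table[rhs] = self.fresh
            self.fresh += 1
        return self.table[rhs]
```

Because the table is keyed by right hand sides, the restriction that no two productions share a right hand side holds by construction. Seeding the table with X → a and X → f(X) makes every term of the form f(...f(a)) intern to the same nonterminal.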

For any fixed term t and any sequence of interning operations the nonterminal
symbol I[t] that results from interning t will be the same for each repeated interning
of t. This allows one to think of the interning operation as a well defined function
from terms to nonterminal symbols determined by the initial congruence grammar.
In fact, the initial congruence grammar determines an equivalence relation on terms
such that two terms s and t are considered to be equivalent if I[s] equals I[t]. For
example, suppose the initial grammar contains the two productions X → f(X)
and X → a and let f^n(a) be an abbreviation for the term f(f(...f(a))) with n
occurrences of f. In this case we have that I[f^n(a)] equals X for any term of
the form f^n(a). This implies that I[g(f^n(a))] will be equal to I[g(f^m(a))] for any
nonnegative integers n and m. Computing I[g(f^n(a))] gives a nonterminal symbol
Y such that the grammar has been extended to include the production Y → g(X).
Although interning can add productions to the grammar, it does not change the
equivalence relation on terms defined by the grammar. Different grammars define
different intern mappings and we will write I_G[s] to mean the result of interning
the term s with respect to the grammar G.

Theorem: For any finite set H of equations between ground terms
there exists a finite congruence grammar Gr(H) such that for any
two ground terms s and t we have that H ⊨ s = t if and only if I_{Gr(H)}[s]
equals I_{Gr(H)}[t]. Furthermore, assuming that hash table operations can
be done in unit time, the grammar Gr(H) can be computed from H in
n log n time where n is the written length of H.

The proof of this theorem is based on well known algorithms for congruence closure
[Kozen, 1977], [Shostak, 1978], [Nelson and Oppen, 1980], [Downey et al., 1980]. A
procedure for incrementally incorporating new equations directly into a congruence
grammar is given in section 7. A nonoptimal algorithm for converting a set of
equations H to a congruence grammar Gr(H) can be described as follows. For
each term s occurring in H we introduce a nonterminal symbol X_s.¹ For each
constant symbol a appearing in H we construct the production X_a → a. For
each term f(w_1, ..., w_k) occurring in H we add the production
X_{f(w_1, ..., w_k)} → f(X_{w_1}, ..., X_{w_k}). Now we process the equations in H
while maintaining a union-find structure on nonterminal symbols. For each
equation s = w in H we call union on the nonterminals X_s and X_w. After
processing all equations we select a canonical representative from each
equivalence class of nonterminal symbols and replace every nonterminal in the
grammar by its canonical representative. The resulting grammar need not be a
congruence grammar because it is now possible that two distinct productions have
the same right hand side. Any time we have two productions X → f(W_1, ..., W_n)
and Y → f(W_1, ..., W_n) with the same right hand side we uniformly replace all
occurrences of X in the grammar by Y. Such replacement is continued until we
have a congruence grammar.
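The nonoptimal algorithm just described can be sketched in a few lines. The sketch assumes terms are nested tuples, uses each term s itself as the name of its nonterminal X_s, and repeats whole passes over the productions rather than achieving the n log n bound of the theorem; the function names are illustrative.

```python
def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    for s in t[1:]:
        yield from subterms(s)

def grammar_from_equations(H):
    """Return Gr(H) as a dict mapping right hand sides to nonterminals.

    H is a list of pairs (s, t) of ground terms (nested tuples).
    """
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    # One nonterminal per term occurring in H.
    for s, t in H:
        for u in (*subterms(s), *subterms(t)):
            parent.setdefault(u, u)
    # Process the equations.
    for s, t in H:
        union(s, t)

    # Merge left hand sides until no two productions share a right hand side.
    changed = True
    while changed:
        changed = False
        table = {}
        for u in parent:
            rhs = (u[0], *(find(w) for w in u[1:]))
            lhs = find(u)
            other = table.setdefault(rhs, lhs)
            if other != lhs:
                union(lhs, other)
                changed = True
    return table
```

The final pass performs no merges, so the returned table uses only canonical representatives and contains no two productions with the same right hand side. For H = {a = f(a)} the result consists of two productions with a common left hand side, matching the grammar X → f(X), X → a.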

The transformation from equations to grammars has an inverse: one can
transform a grammar back into a set of equations. For each nonterminal symbol
X we construct a new constant symbol c_X. The constants of the form c_X will be
called internal constants to distinguish them from the other (external) constants.
A term that does not contain any of these internal constants will be called an
external term. We now have the following definition and lemma.

Definition: If G is a congruence grammar we define Eq(G) to be the set
of equations of the form c_X = f(c_{Y_1}, ..., c_{Y_n}) where G contains the
production X → f(Y_1, ..., Y_n).

¹ We say that s occurs in H if s is either one side of an equation in H or s is a subterm of a
term in an equation in H.

For example, if G contains the two productions X → f(X) and X → a, then
Eq(G) contains the two equations c_X = f(c_X) and c_X = a.

Lemma: For any congruence grammar G, and any two external ground
terms s and t, we have that I_G[s] equals I_G[t] if and only if Eq(G) ⊨
s = t.

Proof Sketch: One can show by induction on the size of a term s that
Eq(G) ⊨ s = c_{I_G[s]}. Then if I_G[s] = I_G[t] = X we have Eq(G) ⊨ s = c_X
and Eq(G) ⊨ t = c_X so Eq(G) ⊨ s = t. To show the converse we add
a production X → c_X for each nonterminal X. These productions
do not alter the intern function on external terms. We then consider
the congruence relation (on both internal and external terms) defined
by the intern function for this extended grammar. This congruence
relation provides a semantic model of Eq(G). So if I_G[s] ≠ I_G[t] then
Eq(G) ⊭ s = t.

The operation Eq introduces internal constants into the equation set. These
internal constants are irrelevant to the equivalence relation on external terms defined
by the equation set. One can define the operation Gr so that it eliminates
internal constants from the grammar. The equation sets constructed by Eq are useful
for analysis and conceptual definitions but they are never actually computed. All
computation is done directly on grammars.


4  Grammar  Rewriting 

This  section  defines  the  basic  concepts  of  grammar  rewriting. 

Definition: A grammar rewriting system is a pair <E, w> where
E is a set of equations between first order terms (usually containing
variables) and w is a weight function which assigns a positive integer
to every ground term.

The weight function induces a well founded order on ground terms by setting
s < t if and only if w[s] < w[t]. Unlike the orderings used in traditional ordered
rewriting systems, e.g., [Martin and Nipkow, 1990], the weight orderings used
in grammar rewriting are not total on ground terms. In term rewriting one is
interested in simplifying one particular term. Grammar rewriting is also focused
on a particular term. However, in grammar rewriting this term is represented by
a nonterminal symbol of a grammar.

Definition: We define the "one step" rewrite relation ↦_E on ground
terms so that σ[s] ↦_E σ[t] provided either s = t ∈ E or t = s ∈ E,
every free variable of t appears in s, and σ is a ground substitution.

For example, if E is the set {g(x, f(x)) = c} then we have that g(h(a), f(h(a))) ↦_E
c but we do not have c ↦_E g(h(a), f(h(a))).
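The one step relation on ground terms amounts to matching one side of an equation against a ground term and instantiating the other side. The sketch below assumes terms are nested tuples with variables written as plain strings; the helper names `match`, `variables`, `instantiate`, and `one_step` are illustrative and do not come from the paper.

```python
def match(pattern, term, subst):
    """Extend subst so that the instantiated pattern equals term, or return None."""
    if isinstance(pattern, str):                  # a variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def variables(t):
    """The set of variables occurring in t."""
    if isinstance(t, str):
        return {t}
    return set().union(*(variables(s) for s in t[1:]))

def instantiate(t, subst):
    """Apply the substitution to t."""
    if isinstance(t, str):
        return subst[t]
    return (t[0], *(instantiate(s, subst) for s in t[1:]))

def one_step(term, equations):
    """All v such that term rewrites to v in one step, trying both directions."""
    results = []
    for lhs, rhs in equations:
        for s, t in ((lhs, rhs), (rhs, lhs)):
            if not variables(t) <= variables(s):
                continue          # every variable of t must appear in s
            subst = match(s, term, {})
            if subst is not None:
                results.append(instantiate(t, subst))
    return results
```

With E = {g(x, f(x)) = c}, the term g(h(a), f(h(a))) rewrites to c, while c has no rewrites because the variable x of g(x, f(x)) does not appear in c.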

Definition: A grammar term is a pair <X, G> where G is a finite
congruence grammar and X is a nonterminal that appears in G.

Definition: We say that a term s is a minimal representative of a
grammar term <X, G>, with respect to a weight function w, if s is
generated by X under G and no other term generated by X is smaller
than s according to w.

Definition: We define the one step rewrite relation ↦_<E,w> on grammar
terms so that

<X, G> ↦_<E,w> <X', Gr(Eq(G) ∪ {u = v})>

provided u is a subterm of some minimal representative of <X, G>
under the weight function w, u ↦_E v, I_G[u] ≠ I_G[v], and X' is the
nonterminal of Gr(Eq(G) ∪ {u = v}) that generates the terms generated
by X under G.

The one-step grammar rewriting operation defined above corresponds to
selecting a term minimally generated by X, rewriting that term according to some
equation in E, and then modifying the grammar so that the result of the rewrite
is included in the language generated by X. A procedure for efficiently computing
Gr(Eq(G) ∪ {u = v}) from G and the equation u = v is given in section 7. This
procedure does not construct an equation set; it performs a direct operation on
the grammar.

The definition of the rewrite relation on grammars requires that the new
equation being incorporated into the grammar is indeed new, i.e., it cannot be an
equation that is already implied by Eq(G). This allows for the existence of normal
forms as defined below.

Definition: We say that a grammar term <X, G> is in normal form
if there is no grammar term <X', G'> such that <X, G> ↦_<E,w> <X', G'>.

As an example of rewriting, let C be the equation set consisting of the single
commutative law x + y = y + x. Let 1 be the weight function that assigns every term
the weight 1 (this weight function corresponds to the empty ordering on terms).
Let G be the grammar consisting of the productions X_1 → a_1 + X_2, X_2 → a_2 + X_3,
..., X_{n-1} → a_{n-1} + X_n, X_n → a_n. In other words, G is the grammar generated
by interning the term a_1 + (a_2 + (... + a_n)) starting with the empty grammar. In
the grammar rewriting system <C, 1> we have that the grammar term <X_1, G>
rewrites to a normal form <X_1, G'> where G' is the grammar consisting of the
productions of the form

X_i → a_i + X_{i+1}

X_i → X_{i+1} + a_i

together with the production

X_n → a_n.

A similar example can be given for the equation set AC consisting of the associative
and commutative laws.

Certain restrictions on the weight function w can be used to ensure that grammar
rewriting always terminates.

Definition: A grammar rewriting system <E, w> is called terminating
if there are no infinite rewrite chains of the form
<X_1, G_1> ↦_<E,w> <X_2, G_2> ↦_<E,w> <X_3, G_3> ↦_<E,w> ....

Definition: A weight function w will be called a polynomial weight
function if for each function symbol f of n arguments there exists a
polynomial P_f(x_1, ..., x_n) in n variables with coefficients greater than
or equal to 1, where each x_i appears in at least one term, where there
is a constant term of at least 1, and such that for any ground terms s_1,
..., s_n we have w[f(s_1, ..., s_n)] = P_f(w[s_1], ..., w[s_n]).

Orderings based on polynomial weight functions are well known in the term
rewriting literature.
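A polynomial weight function can be evaluated by structural recursion on the term. The following sketch assumes terms are nested tuples and that each polynomial P_f is supplied as a Python function; the particular polynomials in `polys` are hypothetical examples chosen to satisfy the conditions of the definition.

```python
def weight(term, polys):
    """Compute w[f(s_1, ..., s_n)] = P_f(w[s_1], ..., w[s_n])."""
    f, *args = term
    return polys[f](*(weight(s, polys) for s in args))

# Hypothetical polynomials: coefficients >= 1, every argument used,
# and a constant term of at least 1.
polys = {
    "a": lambda: 1,
    "b": lambda: 1,
    "+": lambda x, y: x + y + 1,
    "f": lambda x: 2 * x + 1,
}
```

Under these polynomials the term a + f(b) has weight 1 + 3 + 1 = 5. Since every polynomial strictly exceeds each of its arguments, the weight of a term is greater than the weight of any proper subterm.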

Lemma: For a given finite set of function and constant symbols, and
a given weight k, there are only finitely many terms that can be
constructed from those symbols that have weight less than or equal to
k.

Because grammar rewriting does not introduce new constant or function
symbols, and because matching is restricted to minimal weight terms, we have the
following well foundedness lemma for grammar rewriting.

Lemma: If w is a polynomial weight function and E is any finite set
of equations then <E, w> is terminating.

We use ↦*_<E,w> to denote the reflexive transitive closure of the relation ↦_<E,w>.

Definition: Let I_∅[s] be the grammar term that results from interning
s relative to the empty grammar. If I_∅[s] ↦*_<E,w> <X, G> and
<X, G> is in normal form, then <X, G> is called a normal form of s.

Definition: For any ground term s and set of equations E we define
|s|_E to be the set of terms that can be proven to be equal to s using
the equations in E.

Definition: A grammar rewriting system <E, w> is called complete
if for each ground term s, and any normal form <X, G> of s, every
minimal member of |s|_E is generated by X under G.

Definition:  A  grammar  rewriting  system  is  called  canonical  if  it  is 
terminating  and  complete. 

A  canonical  grammar  rewriting  system  <E,  w>  provides  a  decision  procedure  for 
the  equational  theory  E. 

Theorem: Let <E, w> be a canonical grammar rewriting system
and let s and t be any two ground terms. Let <X, G> and <X', G'>
be normal forms for s and t respectively under the rewriting system
<E, w>. Let G'' be the grammar that encodes all equivalences encoded
in G or G', i.e., G'' is Gr(Eq(G) ∪ Eq(G')). We have that E ⊨ s = t if
and only if I_{G''}[s] = I_{G''}[t].

This theorem follows from the invariant that for any ground term u, if
I_∅[u] ↦*_<E,w> <X, G> then X generates u under G, and if <X, G> is a normal
form of u then X generates all minimal elements of |u|_E.

5  Examples 

We let w be the simple polynomial weight function such that for every term
f(s_1, ..., s_n) we have that w[f(s_1, ..., s_n)] = w[s_1] + ... + w[s_n] + 1. Let
A and C be the two equation sets {x + (y + z) = (x + y) + z} and {x + y = y + x}
respectively. Let AC be A ∪ C. One can easily verify that the systems A, C, and
AC are all canonical under this ordering.

Let w be any polynomial weight function satisfying w[s + t] = 2w[s] + w[t] + 1.
Under this weight function we have that w[s + t] < w[t + s] provided w[s] < w[t]
and we have w[s + (t + u)] < w[(s + t) + u] for any terms s, t, and u. The equation
set AC ∪ {x + (y + z) = y + (x + z)} forms a canonical system where every term
normalizes to a grammar term whose minimal representatives are the weight sorted
permutations of the addends under a standard parenthesization. If the addends
are linearly ordered by weight then there is only one minimal representative.
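The two inequalities stated for this weight function can be checked by brute force over small argument weights. The sketch below works directly with weights rather than terms; the function names and the bound `max_weight` are illustrative assumptions.

```python
from itertools import product

def w_plus(ws, wt):
    """Weight of s + t given the weights of s and t: 2 w[s] + w[t] + 1."""
    return 2 * ws + wt + 1

def check_inequalities(max_weight):
    """Check both inequalities for all argument weights in 1..max_weight."""
    for ws, wt, wu in product(range(1, max_weight + 1), repeat=3):
        # w[s + t] < w[t + s] whenever w[s] < w[t]
        if ws < wt and not w_plus(ws, wt) < w_plus(wt, ws):
            return False
        # w[s + (t + u)] < w[(s + t) + u]
        if not w_plus(ws, w_plus(wt, wu)) < w_plus(w_plus(ws, wt), wu):
            return False
    return True
```

Expanding the definitions gives the same result symbolically: w[t + s] − w[s + t] = w[t] − w[s], and w[(s + t) + u] − w[s + (t + u)] = 2w[s] + 1, which is always positive.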

We now consider Abelian groups. Let w be any order satisfying w[s + t] = w[s] +
w[t] + 1 and w[−s] = 2w[s]. Under this ordering we have that w[(−s) + (−t)] <
w[−(s + t)]. The equation set AC ∪ {−(x + y) = (−x) + (−y), x + (−x) = 0, x + 0 =
x} is a canonical system under this ordering. Every term normalizes to a grammar
whose minimal representatives form an AC equivalence class. Refinements of the
ordering can give canonical systems which generate smaller grammars.
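The inequality used for Abelian groups follows by expanding both sides with the two stated identities for w. As a worked check, using only those identities:

```latex
\begin{align*}
w[(-s) + (-t)] &= w[-s] + w[-t] + 1 = 2w[s] + 2w[t] + 1, \\
w[-(s + t)]    &= 2\,w[s + t] = 2\bigl(w[s] + w[t] + 1\bigr) = 2w[s] + 2w[t] + 2.
\end{align*}
```

The two weights differ by exactly 1, so distributing negation over a sum always yields a strictly lighter term.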

There  seems  to  be  little  difficulty  in  handling  the  well  known  theories  of  ACI, 
groups  of  exponent  2,  and  Boolean  rings  where  each  theory  can  be  handled  with 
different  weight  functions  that  correspond  to  different  minimal  term  sets. 

Let E be AC ∪ {f(x + x) = 1}. This is given in [Martin and Nipkow, 1990]
as an example of a theory that cannot be handled by ordered term rewriting
without special unification procedures. However, this equation set is canonical
under grammar rewriting using the simple ordering given in the first example of
this section.


6  Locally  Context  Free  Theories 

Let B (for Band) be the equation set {x * (y * z) = (x * y) * z, x * x = x}. It has been
shown that no finite term rewriting or word rewriting system can be canonical for B
[Baader, 1990]. However, it is possible to show that B itself is a canonical grammar
rewriting system relative to the empty term ordering (and under fair rewriting
as defined below). The proof of this result can be generalized and provides an
interesting class of canonical grammar rewriting systems. Throughout this section
we use the weight function 1 that assigns the weight 1 to all terms. This induces the
empty ordering on terms. We start with some simple definitions and observations.

Throughout this section we let E be a fixed but arbitrary finite set of equations.

Definition: E is called fully bidirectional if for each equation s = t in
E the terms s and t contain the same set of variables.

Recall that a grammar rewriting system <E, w> is complete if any normal form
of a term s represents all minimal elements of |s|_E.

Lemma:  If  E  is  fully  bidirectional  then  <E,  1>  is  complete. 

If  E  is  fully  bidirectional  and  all  terms  are  the  same  weight  then  grammar 
rewriting  starting  with  a  term  s  corresponds  to  unrestricted  enumeration  of  the 
equivalence  class  of  s.  If  this  process  leads  to  a  normal  form  then  that  normal  form 
must  be  a  grammar  that  generates  the  entire  equivalence  class  of  s.  Of  course, 
for  many  equation  sets  E  the  system  <E,  1>  fails  to  produce  any  normal  forms 
and  the  above  lemma  is  vacuously  true.  However,  many  equation  sets  do  generate 
normal  forms  under  the  empty  term  ordering. 

Definition: An equation set E is called finitary if for every ground
term s the set |s|_E is finite.

Lemma: If E is fully bidirectional and finitary then <E, 1> is
canonical.

This follows directly from the fact that if E is finitary then the enumeration
of the equivalence class of a term must terminate. The above lemma
immediately implies that <A, 1>, <C, 1> and <AC, 1> are all canonical. The next few
definitions and lemmas give a more interesting class of canonical systems.

Definition: An equation set E is called locally context free if, for every
ground term s, the set |s|_E can be generated by a finite congruence
grammar.


Note  that  if  E  is  locally  context  free  then  each  equivalence  class  under  E  is 
a  context  free  language  in  the  traditional  sense.  If  E  is  locally  context  free  then 
normal  forms  exist.  This  does  not  imply,  however,  that  the  grammar  rewriting 
system is terminating. Let E contain the three equations f(x) = x, g(x) = x,
and f(x) = f(g(x)). E is locally context free. However, by repeatedly selecting
only  the  last  equation  it  is  possible  for  a  grammar  rewrite  system  (under  the 
empty  term  ordering)  to  run  forever.  We  can  rule  out  this  kind  of  nontermination 
by  considering  only  fair  rewriting  schemes.  Intuitively,  a  rewrite  system  is  fair 
provided  that  no  equation  is  ignored  indefinitely. 

Definition: An infinite chain <X_1, G_1> ↦_<E,w> <X_2, G_2> ↦_<E,w>
<X_3, G_3> ↦_<E,w> ... is said to be fair if for any subterm u of a
minimal representative of a grammar term <X_i, G_i>, if u ↔_E v then
there exists some grammar term <X_j, G_j> with j > i such that either
u is not a subterm of a minimal representative of <X_j, G_j> or
I_{G_j}[u] = I_{G_j}[v].

Definition: A grammar rewriting system <E, w> is said to terminate
under fair rewriting if there is no fair infinite rewriting chain
for <E, w>. The system <E, w> will be called canonical under fair
rewriting if it is complete and terminates under fair rewriting.

Lemma: If E is fully bidirectional and locally context free then <E, 1>
is canonical under fair rewriting.

If E is fully bidirectional and all terms have the same weight then grammar
rewriting of a term s corresponds to the enumeration of the entire equivalence class
of s. If this enumeration is done in a fair manner then we eventually generate every
equation of the form u = v that is provable from E where u and v are subterms
of terms equivalent to s. If the equivalence class of s is generated by a finite
grammar then this grammar can be written as Gr(H) where H is some finite set of
equations of this form. So the rewrite process must eventually construct this
final grammar and terminate.

To  show  that  the  equation  set  B  is  canonical  it  now  suffices  to  show  that  it  is 
locally  context  free.  The  following  proof  is  due  largely  to  Robert  Givan  and  Carl 
Witty  of  the  MIT  AI  Laboratory. 

Definition:  An  equation  set  E  is  said  to  be  locally  finite  if  for  any  finite 
set  S  of  constant  symbols  the  set  of  all  terms  that  can  be  constructed 
from  the  constants  in  S  and  the  functions  in  E  fall  into  a  finite  number 
of  equivalence  classes  under  E. 

For example, the equation set AC ∪ {x + x = x} has the property that for any fixed
set of n constants, the sums that can be constructed from those n constants fall
into 2^n - 1 equivalence classes, one equivalence class for each nonempty subset
of the constants.
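
The count 2^n - 1 can be checked on small cases. The sketch below relies on the standard fact that, under associativity, commutativity and idempotence, two sums are equal exactly when they use the same set of constants, so that set serves as a class representative. The tuple encoding of sums and the helper names are assumptions of this illustration, and the check only samples sums up to a fixed nesting depth.

```python
from itertools import product


def sums_up_to_depth(constants, depth):
    """All sums over `constants` whose syntax tree has depth at most `depth`."""
    terms = set(constants)
    for _ in range(depth):
        terms |= {("+", s, t) for s, t in product(terms, repeat=2)}
    return terms


def class_representative(term):
    """Representative under AC with x + x = x: the set of constants used."""
    if isinstance(term, str):
        return frozenset([term])
    _, left, right = term
    return class_representative(left) | class_representative(right)


def count_classes(constants, depth):
    """Number of distinct classes hit by the sums of bounded depth."""
    return len({class_representative(t)
                for t in sums_up_to_depth(constants, depth)})
```

With three constants and depth 2, `count_classes(["a", "b", "c"], 2)` evaluates to 7, which is 2^3 - 1.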

It is interesting to note that E is locally finite if and only if for any finite set S
the free E-algebra generated by S is finite.

Lemma: If E is locally finite and fully bidirectional then E is locally
context free.

Consider a term s and a term t in |s|_E. If E is fully bidirectional
then every constant that appears in t must appear in either s or E. Since E is
locally finite there are only a finite number of equivalence classes of such terms. By
creating a nonterminal symbol for each equivalence class one can construct a
grammar for generating the equivalence class of s. Given that a grammar exists, it
can be computed using the fact that <E, 1> is canonical whenever E is locally
context free.

The  converse  of  the  above  lemma  does  not  hold.  There  are  locally  context  free 
equation  sets  which  are  not  locally  finite.  The  empty  equation  set  is  an  example. 

One can use the infinite canonical word rewrite system for B given in [Siekmann
and Szabo, 1982] (also described in [Baader, 1990]) to prove that B is locally finite:
for any set of n constants there exists a k such that all words built from the given
constants can be simplified to a word no longer than k. The above lemmas now
imply that <B, 1> is canonical under fair rewriting.


7  Incorporating  Equations  into  Grammars 

This section gives an algorithm for computing Gr(Eq(G) ∪ {s = t}) from the
grammar G and the equation s = t. The grammar Gr(Eq(G) ∪ {s = t}) is directly
computed from G by incrementally adding and removing productions. The algorithm
given here is quite similar to the congruence closure procedure described in
[Nelson and Oppen, 1980]. The algorithm has been reformulated here to operate
on grammars and optimized to run in order n log n time (under the assumption
that hash table operations take unit time).2

This procedure uses the internal constants of the form c_X that are associated
with the nonterminals of the grammar. The procedure maintains an equivalence
relation represented by three sets of equations. First, the procedure maintains
a congruence grammar. Second, the procedure maintains a queue of equations
of the form c_X = c_Y. Third, an additional set of equations between constants
of the form c_X is maintained in a union-find structure on these constants. The
equivalence relation determined by these three sets of equations is maintained as
a fixed invariant of the procedure.

As mentioned above, equations between constant symbols can be handled by
the well known union-find procedure.3 Consider a set of equations a_1 = b_1,
a_2 = b_2, ..., a_n = b_n. This set of equations can be represented in a union-find
structure by executing union(a_1, b_1), union(a_2, b_2), ..., union(a_n, b_n). To see
if an equation c = d follows from the given equations one computes find(c) and
find(d). The equation c = d is provable if and only if these two find operations
return the same value.

2 A somewhat different n log n algorithm for congruence closure is described in
[Downey et al., 1980].

3 A description of the union-find protocol and efficient union-find algorithms can be
found in most modern algorithms texts.

In the following procedure we assume that the find operation is such that
find(c_X) is a canonical member of the equivalence class of c_X relative to the
equivalence relation encoded in the union-find structure. In other words, find(c_X)
is a constant c_Y such that find(c_Y) equals c_Y. We also assume that when two
equivalence classes are merged with a union operation the canonical representative
of the resulting equivalence class is selected to be one of the two previous canonical
representatives.

Definition: A constant c_X is said to be dead if find(c_X) is some
constant other than c_X. A constant that is not dead is said to be alive,
i.e., a constant c_Y is alive if find(c_Y) equals c_Y.
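
The assumptions on find and union stated above are met by the usual textbook structure. The following is a minimal sketch using union by size and path compression; the class name `UnionFind` and the method `is_dead` are hypothetical and are not taken from the paper.

```python
class UnionFind:
    """Union-find over hashable constants, created lazily on first use."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, c):
        """Return the canonical representative of the class of c."""
        self.parent.setdefault(c, c)
        self.size.setdefault(c, 1)
        root = c
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[c] != root:          # path compression
            nxt = self.parent[c]
            self.parent[c] = root
            c = nxt
        return root

    def union(self, a, b):
        """Merge two classes; the result is one of the two old representatives."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:      # keep the larger class alive
            ra, rb = rb, ra
        self.parent[rb] = ra                   # rb is now dead
        self.size[ra] += self.size[rb]
        return ra

    def is_dead(self, c):
        """A constant is dead if find returns some other constant."""
        return self.find(c) != c
```

An equation c = d follows from the recorded equations exactly when `find(c) == find(d)`.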

Procedure for computing Gr(Eq(G) ∪ {s = t}):

1. Let Z and W be the nonterminals I_G[s] and I_G[t].

2. Initialize S to be a queue containing the single equation c_Z = c_W.

3. While the queue S is not empty do the following.

(a) Remove an equation c_Z = c_W from the queue S.

(b) Let c_X be find(c_Z) and let c_Y be find(c_W).

(c) If c_X is the same symbol as c_Y then do nothing, otherwise:

(d) Call union(c_X, c_Y).

(e) Swap the roles of X and Y if necessary so that c_X is dead and
c_Y is still alive.

(f) Let P be the set of all productions involving X.

(g) Remove all the productions in P from the grammar.

(h) For each production Z → f(W_1, ..., W_n) in P do the following.

i. Let c_Z' be find(c_Z) and for each W_i let c_W_i' be find(c_W_i).

ii. If there is no production whose right hand side is f(W_1', ..., W_n')
then add the production Z' → f(W_1', ..., W_n').

iii. If there is already a production U → f(W_1', ..., W_n')
where U is different from Z' then add the equation c_U =
c_Z' to the queue S.
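
The steps above can be sketched in a few lines. The sketch is simplified and makes several assumptions: nonterminals stand in for their internal constants, the grammar is a dictionary from right-hand sides (f, W_1, ..., W_n) to left-hand nonterminals, the productions involving X are found by a linear scan, and the first representative is always the one killed. It therefore does not achieve the n log n bound, which would need an index from nonterminals to productions and a weighted union. The name `incorporate` is hypothetical.

```python
from collections import deque


def incorporate(grammar, z, w):
    """Equate nonterminals z and w in a congruence grammar, in place.

    `grammar` maps a right-hand side (f, W_1, ..., W_n) to its nonterminal,
    so two productions can never share a right-hand side. Returns `find`.
    """
    parent = {}

    def find(x):
        while parent.get(x, x) != x:
            x = parent[x]
        return x

    queue = deque([(z, w)])                            # step 2
    while queue:                                       # step 3
        a, b = queue.popleft()                         # (a)
        x, y = find(a), find(b)                        # (b)
        if x == y:                                     # (c)
            continue
        parent[x] = y                                  # (d), (e): x dies, y lives
        involved = [(rhs, lhs) for rhs, lhs in grammar.items()
                    if lhs == x or x in rhs[1:]]       # (f)
        for rhs, _ in involved:                        # (g)
            del grammar[rhs]
        for rhs, lhs in involved:                      # (h)
            new_lhs = find(lhs)                        # i.
            new_rhs = (rhs[0],) + tuple(find(v) for v in rhs[1:])
            existing = grammar.get(new_rhs)
            if existing is None:                       # ii.
                grammar[new_rhs] = new_lhs
            elif existing != new_lhs:                  # iii.
                queue.append((existing, new_lhs))
    return find
```

For example, starting from the productions A → a, B → b, FA → f(A) and FB → f(B), equating A and B also forces FA and FB together, leaving three productions over two living nonterminals.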

This procedure maintains the invariant that no two productions in the grammar
have the same right hand side, i.e., the grammar is always a congruence grammar.
Furthermore, the equivalence relation encoded in the equations in the grammar,
the queue, and the union-find structure is maintained as a fixed invariant. To
check this one must check that every added equation is derivable from previous
equations and that every equation removed in step g is derivable from previous
equations plus those equations added in step h. The procedure also maintains the
invariant that every nonterminal in every production is "alive", i.e., if X appears
in a production of the grammar then find(c_X) equals c_X. Every nontrivial
execution of the main loop reduces the number of living nonterminal symbols, so
the procedure must terminate. Furthermore, when the procedure terminates the
equational theory encoded in the union-find structure can be dropped without
altering the induced equivalence relation on external terms. To prove this consider
an extended grammar that includes all productions of the form X → c_Y where
X is a living nonterminal and find(c_Y) is c_X. The equation set of the extended
grammar encodes the equivalence relation of the original grammar plus the
equivalences in the union-find structure. However, this extended grammar encodes the
same intern function on external terms as the unextended grammar.

If we assume a bound on the number of arguments taken by function symbols,
e.g., no function takes more than three arguments, and assume that hash table
operations can be performed in unit time, then under an appropriate implementation
of union-find it can be shown that the above procedure terminates in order
n log n time where n is the number of productions in the original grammar. In
practice the incorporation of a single new equation into a large grammar requires
the manipulation of only a small subset of the grammar.


8  A  Grammar  Rewriting  Algorithm 

In this section we consider a fixed but arbitrary grammar rewriting system <E, w>,
where w is a polynomial weight function, and give a procedure for computing all
the ways in which a given grammar term can be rewritten under <E, w>. The
procedure is incremental so that if <X, G> ↦_<E,w> <X', G'> then the set of
possible ways of rewriting <X', G'> can be computed incrementally from the set
of possible ways of rewriting <X, G>. Incremental procedures can be defined by
"inference rules" that are run in an incremental forward chaining manner. We first
give rules for deriving the weight of nonterminal symbols as defined below.

Definition: For each nonterminal Y of a congruence grammar the
weight of Y, denoted w[Y], is the minimum weight of all terms generated
by Y.

The following inference rule can be used to propagate bounds on weights.

$$\frac{w[X_1] \le w_1 \quad \cdots \quad w[X_n] \le w_n \quad Z \to f(X_1, \ldots, X_n)}{w[Z] \le P_f(w_1, \ldots, w_n)}$$

For constant symbols (functions of no arguments) the above inference rule can
be used to generate a weight bound directly from a production of the form Z → c.
Weight bounds generated by constants can then be propagated to other symbols.
One can show that, for any given nonterminal Y, the tightest bound that can be
derived for w[Y] using the above rule is in fact the weight of Y as defined above.
In practice, simply running this rule over the entire grammar until the tightest
bounds are derived seems to be an acceptable incremental algorithm for computing
the weight of nonterminals. However, the theoretical worst case behavior of this
algorithm is quite bad. An n log n algorithm can be derived by placing derived
bounds on a priority queue and processing the tightest bounds first.

Procedure to compute the weight of all nonterminals:

1. Initialize S to be the priority queue containing all pairs of the
form <Y, w> where the grammar contains the production Y → a
where a is a constant and w is the weight of a.

2. Until S is empty, or until every nonterminal has been assigned a
weight, do the following.

(a) Remove a pair <Y, w> from S such that w is the minimum
weight of all pairs on S.

(b) If Y has already been assigned a weight do nothing. Otherwise:

(c) Assign Y the weight w.

(d) For each production Z → f(W_1, ..., W_n) such that some W_i
is Y and such that each W_i has been assigned a weight w_i,
add the pair <Z, P_f(w_1, ..., w_n)> to the queue S.
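
The procedure has the same shape as a shortest-path computation and can be sketched with a binary heap. In the sketch below, productions are triples (Z, f, (W_1, ..., W_n)) and the weight polynomials are supplied as one function `poly(f, weights)`; both choices, along with the names `nonterminal_weights` and `size_poly`, are assumptions of this illustration. The example polynomial is term size, P_f(w_1, ..., w_n) = 1 + w_1 + ... + w_n, chosen only for concreteness.

```python
import heapq


def nonterminal_weights(productions, poly):
    """Assign each nonterminal the minimum weight of the terms it generates."""
    uses = {}                                    # nonterminal -> productions using it
    for prod in productions:
        for arg in set(prod[2]):
            uses.setdefault(arg, []).append(prod)

    # Step 1: seed the queue from productions whose right-hand side is a constant.
    heap = [(poly(f, []), z) for z, f, args in productions if not args]
    heapq.heapify(heap)

    weight = {}
    while heap:                                  # step 2 (runs until S is empty)
        w, y = heapq.heappop(heap)               # (a) pair of minimum weight
        if y in weight:                          # (b) already assigned
            continue
        weight[y] = w                            # (c)
        for z, f, args in uses.get(y, []):       # (d)
            if all(a in weight for a in args):
                heapq.heappush(heap, (poly(f, [weight[a] for a in args]), z))
    return weight


def size_poly(f, weights):
    """Assumed weight polynomial: one plus the sum of the argument weights."""
    return 1 + sum(weights)
```

A call such as `nonterminal_weights(productions, size_poly)` returns a dictionary from nonterminals to weights; nonterminals that generate no term are simply absent from it.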

Each pair added to the queue has a larger weight than the last pair removed
from the queue. This implies that the pairs removed from the queue have
monotonically increasing weight. This, plus the assumed properties of the polynomial
weights, implies that for each nonterminal symbol Y the weight assigned to Y is
the minimum over all the productions from Y of the weight computed from that
production. One can check, by selecting productions that minimize weight,
that if Y has been assigned weight w then Y generates a term of weight w.
Furthermore, one can prove by induction on the weight of a term s that the weight of
s is at least as large as the weight assigned to I[s], the symbol that generates s.
Since the procedure only assigns a single weight to each nonterminal symbol the
number of pairs placed on the priority queue is linear in the size of the grammar.
Order n insertions and deletions from a priority queue can be done in n log n time.
Assuming a bound on the number of arguments taken by any function symbol, the
other operations in this procedure can be performed in order n time so the total
running time is order n log n.

The  next  step  is  to  identify  those  nonterminals  in  the  grammar  that  generate 
subterms  of  minimal  representatives  of  <X,  G>. 

Definition: A production Z → f(X_1, ..., X_n) is minimal if w[Z] =
P_f(w[X_1], ..., w[X_n]).


Lemma:  If  s  is  a  minimal  representative  of  the  grammar  term  <X,  G> 
then  s  is  generated  by  X  in  that  subset  of  G  which  consists  of  just  the 
minimal  productions  of  G. 

Definition: We say that a nonterminal symbol Y is a minimal subterm
nonterminal of a grammar term <X, G> if either Y is X (X is a
minimal subterm nonterminal) or G contains a minimal production
W → f(Z_1, ..., Z_n) where W is a minimal subterm nonterminal and
some Z_i is Y.

As an example consider the grammar term <X, G> where G consists of the
five productions

X → f(Y), Y → g(Z), Z → a,

Z → h(W), W → f(Z).

The production Z → h(W) is not minimal: it does not provide a smallest
term generated by Z. However, all other productions are minimal, including the
production W → f(Z) which provides a smallest term generated by W. However,
the nonterminal W is not a minimal subterm nonterminal: it does not generate
a subterm of a minimal representative of <X, G>.
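
The two definitions can be checked on the five-production example. The sketch below assumes term size as the weight, P_f(w_1, ..., w_n) = 1 + w_1 + ... + w_n, which gives the weights X = 3, Y = 2, Z = 1 and W = 2. Productions are written as triples (Z, f, (X_1, ..., X_n)); the function and constant names are only labels and do not affect the result. All identifiers are hypothetical.

```python
def minimal_productions(productions, weight, poly):
    """Productions Z -> f(X_1..X_n) with w[Z] = P_f(w[X_1], ..., w[X_n])."""
    return [(z, f, args) for z, f, args in productions
            if weight[z] == poly(f, [weight[a] for a in args])]


def minimal_subterm_nonterminals(root, productions, weight, poly):
    """Nonterminals reachable from `root` through minimal productions only."""
    by_lhs = {}
    for z, _, args in minimal_productions(productions, weight, poly):
        by_lhs.setdefault(z, []).append(args)
    found, stack = {root}, [root]
    while stack:
        for args in by_lhs.get(stack.pop(), []):
            for y in args:
                if y not in found:
                    found.add(y)
                    stack.append(y)
    return found


def size_poly(f, weights):
    """Assumed weight polynomial: one plus the sum of the argument weights."""
    return 1 + sum(weights)


EXAMPLE = [("X", "f", ("Y",)), ("Y", "g", ("Z",)), ("Z", "a", ()),
           ("Z", "h", ("W",)), ("W", "f", ("Z",))]
EXAMPLE_WEIGHTS = {"X": 3, "Y": 2, "Z": 1, "W": 2}
```

On this input every production except Z → h(W) is reported as minimal, and the minimal subterm nonterminals of <X, G> are X, Y and Z, in agreement with the discussion above.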

The  procedure  for  constructing  matches  uses  a  simple  extension  of  the  grammar. 

Definition: An extended congruence grammar is a congruence grammar
such that, for each nonterminal X appearing in a production of
the grammar, the grammar also contains the production X → c_X.

We define a match to be a triple, match[Y, s, σ], where Y is a nonterminal, s
is a term, and σ is a substitution such that I[σ[s]] is Y. Matches can be computed
by starting with "basic matches" and creating new matches according to inference
rules that generate new matches from old matches. For each minimal subterm
nonterminal Y, and for each variable x occurring in E, we create the basic match
match[Y, x, {x ↦ c_Y}]. To minimize the number of basic matches it is important
to rename the variables in equations in E to minimize the total number of
variables in E. Typically there will be no more than 3 or 4 variables in E. We also
create basic matches for constant symbols that appear in E. For each constant a
appearing in E we create the basic match match[I_G[a], a, ∅] where ∅ is the empty
substitution. More complicated matches can be constructed using the following
rule.


$$\frac{\mathrm{match}[Y_1, s_1, \tau_1] \quad \cdots \quad \mathrm{match}[Y_n, s_n, \tau_n] \quad Z \to f(Y_1, \ldots, Y_n)}{\mathrm{match}[Z, f(s_1, \ldots, s_n), \sigma]}$$

The substitutions are represented by finite lists of variable-value pairs. In all
the substitutions constructed by the matching procedure the values assigned to
variables are always internal constants. The above rule only applies when the
substitutions τ_i agree on all shared variables: if τ_i and τ_j both contain a pair
assigning a value to the variable x then they must both assign x to the same
internal constant. The substitution σ is simply the union of the τ_i's. The above
rule is also restricted to the case where Z is a minimal subterm nonterminal, the
production Z → f(Y_1, ..., Y_n) is minimal, and f(s_1, ..., s_n) is a term occurring
in E.
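
A single application of the match rule can be sketched as follows. Substitutions are dictionaries from variables to internal constants, a match is a triple (Y, s, τ), a production is a triple (Z, f, (Y_1, ..., Y_n)), and a pattern is a tuple (f, s_1, ..., s_n); these encodings and the function names are assumptions of the illustration. The sketch checks only the agreement condition on substitutions and the shape of the production. The restrictions to minimal productions, minimal subterm nonterminals and terms occurring in E are left to the caller.

```python
def merge_substitutions(substitutions):
    """Union of the substitutions, or None if two disagree on a shared variable."""
    merged = {}
    for tau in substitutions:
        for var, const in tau.items():
            if merged.setdefault(var, const) != const:
                return None
    return merged


def combine_matches(sub_matches, production, pattern):
    """Apply the match rule once; return (Z, pattern, sigma) or None."""
    z, f, args = production
    if pattern[0] != f or len(pattern) - 1 != len(args):
        return None
    if len(sub_matches) != len(args):
        return None
    for (y, s, _), arg, sub_pattern in zip(sub_matches, args, pattern[1:]):
        if y != arg or s != sub_pattern:       # premise does not fit the production
            return None
    sigma = merge_substitutions(tau for _, _, tau in sub_matches)
    return None if sigma is None else (z, pattern, sigma)
```

For the pattern x + y and a production Z → +(A, B), the basic matches for A and B combine into a match for Z. For the pattern x + x the same two basic matches are rejected, because they assign x to different internal constants.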

The restrictions on the above inference rule are "nonmonotonic". The addition
of a new production can cause other productions to go from being minimal
to being nonminimal. When matches are computed incrementally as new productions
are added it is possible that previously computed matches become "obsolete"
in the sense that they were computed from productions that have now become
nonminimal.4 The failure to remove matches that are obsolete due to productions
that are no longer minimal will cause minor overgeneration of matches. The time
required to detect and remove these obsolete matches is probably greater than the
time taken to perform any extra rewrites they cause. Rewrites generated by these
obsolete matches are semantically sound and do no harm.

4 Matches can also become obsolete if nonterminals or constant symbols involved in the match
become "dead" due to generated equations between nonterminals. Matches that are obsolete due
to references to dead nonterminals or dead constants are easily detected and eliminated.

Finally, we construct "equate relations" of the form c_Z ↔_E v where c_Z is
an internal constant and v is a ground term involving internal constants. More
specifically, if we derive match[Z, s, σ] and E contains either s = t or t = s
where every free variable in t occurs in s, then we can derive c_Z ↔_E σ[t]. The set
of derived equate relations of the form c_Z ↔_E v provides all the possible ways of
rewriting the grammar. For each equate relation c_Z ↔_E v the grammar can be
rewritten by using the procedure of section 7 to equate c_Z and v.


9  Summary 

Grammar rewriting is motivated by the desire to increase the success rate of
attempts to prove equations in nonconfluent rewrite systems. Intuitively, the rewrite
process generates a set of terms rather than an individual term, and by generating
two sets of terms, rather than two classical normal forms, we increase the
likelihood of proving the desired equation. Although grammar rewriting is primarily
motivated by the need to handle nonconfluent theories, it also provides a new
kind of canonical rewriting system. Under grammar rewriting there exist finite
canonical systems for equational theories, such as idempotent semigroups, that
do not have finite canonical systems under traditional notions of rewriting. The
pragmatic value of grammar rewriting in general purpose provers, such as [Boyer
and Moore, 1979], has not yet been adequately investigated.